In-silico studies of inhibitory compounds against protease enzymes of SARS-CoV-2

In December 2019, an outbreak of COVID-19 caused by SARS-CoV-2 raised worldwide health concerns. In this study, molecular docking and computational drug repurposing approaches were used to assess the efficacy of plant-based inhibitory compounds against the SARS-CoV-2 main protease enzyme and papain-like protease enzyme. Twenty phytochemical inhibitory compounds were collected and screened based on Lipinski’s rule, and eleven compounds were selected as a result. Quantitative structure–activity relationship analysis was performed before molecular docking, specifically to check the antiviral activity of the inhibitory compounds. The docking of these compounds was validated using the online server Database of Useful Decoys: Enhanced. The binding affinity value and pharmacokinetic properties of the compound Aloin indicated that it can be used against the main protease enzyme of SARS-CoV-2. This makes it a promising compound to follow further in cell and biochemical-based assays to explore its potential use against COVID-19.


Introduction
In China, on December 31, 2019, a cluster of pneumonia cases was reported in people linked to the seafood wholesale market in Wuhan, Hubei Province. On January 7, 2020, Chinese health authorities confirmed that this cluster was related to a novel coronavirus, 2019-nCoV. [1] The virus was named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) because of its link with severe respiratory illness. The epidemic record shows that the virus can cause a wide range of illness, from mild to severe symptoms, including death.
Coronaviruses belong to the family Coronaviridae and subfamily Coronavirinae. They are positive-stranded, enveloped ribonucleic acid (RNA) viruses with glycoprotein spikes projecting from their viral envelopes, which gives them a corona or halo-like appearance. [2] The outbreak was declared a Public Health Emergency of International Concern by the World Health Organization (WHO) in January 2020 and a pandemic on March 11, 2020. [3] Signs and symptoms of SARS-CoV-2 infection are fever, nasal discharge, frequent coughs, and sore throat. In fatal cases, acute respiratory distress, pneumonia, and multiple organ failure may occur. [4] These symptoms are similar to those of influenza, which include itching of the throat, runny nose, cough, fatigue, headache, and shortness of breath, but in COVID-19 the signs persist for a longer time. [5] The SARS-CoV-2 genome codes for three types of proteins: structural proteins, nonstructural proteins, and accessory proteins. Nonstructural proteins include the chymotrypsin-like main protease enzyme, the papain-like protease, and the RNA-dependent RNA polymerase. Nonstructural proteins mostly support the virus life cycle. Spike (S) glycoproteins are structural proteins. These proteins play a key role in the interaction of the virus with host cell receptors during its entry into the host cell. [6] Moreover, after entry of the coronavirus into the host cell, viral RNA must be translated using the host cell machinery, which gives rise to virus-encoded proteins from diverse open reading frames. The open reading frames are translated into two polyproteins, pp1a and pp1ab. Processing of these two polyproteins gives rise to sixteen mature nonstructural proteins. The chymotrypsin-like main protease enzyme processes thirteen nonstructural proteins. The papain-like protease enzyme cleaves three nonstructural proteins (nsp1, nsp2, and nsp3).
[7] Two main activities are performed by the PLpro enzyme: the removal of ubiquitin and the removal of the ubiquitin-like protein ISG15 (interferon-stimulated gene 15) from cellular proteins. PLpro can also inhibit the production of chemokines and cytokines, both of which are essential for activating the host immune response against infection by the virus. For this reason, the papain-like protease enzyme is considered a novel target for developing a drug against COVID-19. [8] Some antiviral drugs already used in the treatment of Middle East Respiratory Syndrome (MERS)-CoV and SARS-CoV may also be used in the treatment of SARS-CoV-2. Lopinavir, ritonavir, Remdesivir, antimalarial drugs such as chloroquine, and certain other drugs may also be used in the treatment of COVID-19. Remdesivir is a broad-spectrum antiviral prodrug that has in vitro antiviral activity against a set of RNA viruses such as SARS-CoV, MERS-CoV, Ebola virus, Hendra virus, and Nipah virus. [9] Plants produce compounds known as phytochemical compounds, which have antimicrobial activity. Recently, a variety of inhibitory compounds taken from plants have shown antibacterial, antifungal, and antiviral activity. [10] These compounds may be flavonoids, alkaloids, flavones, phenols, and polyphenols. Phytochemical compounds have high potential and beneficial properties against viral infection and other health-related complications. They are considered among the best candidate therapeutic agents against the coronavirus because they can inhibit the replication process of the virus and also block viral entry into the host. Our study involved the screening of inhibitory compounds showing high efficacy against the coronavirus protease enzymes. In silico study is mostly used for such screening, here to check whether these inhibitory compounds have a strong binding affinity with the protease enzymes of SARS-CoV-2.
[11] To enable the rapid discovery of antiviral compounds with clinical potential, we developed an approach that combines structure-assisted drug design, computer-generated drug screening, and high-throughput screening to repurpose existing drugs, using the SARS-CoV-2 main protease enzyme (3CLpro) and papain-like protease enzyme (PLpro) as targets. [12] Molecular docking studies are used to understand the interaction between inhibitory compounds and protein molecules. AutoDock tools are used to check the binding mode of inhibitory compounds with the target protein.

Preparation of target
To perform in silico studies of the ligand-receptor interaction, the 3D crystal structures of 3CLpro (PDB ID: 7BRO) and PLpro (PDB ID: 4RNA) were downloaded from the Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB). Discovery Studio was used to remove water and ligand molecules from the protease enzymes.

Selection of ligand
A library of 20 phytochemical inhibitory compounds was assembled. These inhibitory compounds included alkaloids, flavonoids, secoiridoid glycosides, flavanols, and phenolic compounds. These compounds were then screened against the protease enzymes of SARS-CoV-2. The 3D structures of these compounds were obtained from PubChem in Structure Data File (SDF) format. The twenty phytochemical inhibitory compounds and their PubChem compound ID numbers are listed in Table 1.

Quantitative structure-activity relationship (QSAR) analysis
The bioactivity of the eleven inhibitory compounds (listed in the Results section) was screened by QSAR analysis using the Way2drug/PASS server (http://www.way2drug.com/PASSOnline/). Specifically, the antiviral activity of the eleven inhibitory compounds and the reference compound was determined using this online software. [28]

Molecular docking
Discovery Studio was used to visualize the 3D structure of the protein and to obtain a graphic view of the ligand. Modifications of the protein and ligand were also made in Discovery Studio. These modifications include the addition of polar or non-polar hydrogen atoms, removal of water molecules, and substitution of amino acids. [29] AutoDock Tools 1.5.6 (https://autodock.scripps.edu/download-autodock4/) was used to explore the binding mode of the inhibitory compounds on a 3D model of the protease enzymes (7BRO/4RNA) of SARS-CoV-2. AutoDock Tools is available from the Scripps Research Institute.
Table 1 List of phytochemical inhibitory compounds with their PubChem CIDs.
The values of the grid box were then set carefully. The spacing of the grid box was set to 1.000 Å. For each inhibitory compound, the dimensions (x, y, z) of the grid box were adjusted so that the box covered the whole targeted protease enzyme. The grid box center along all three axes (x, y, z) was also set for each inhibitory compound. The resulting PDBQT files of the compound and protein were then used to run a command in Vina (https://vina.scripps.edu/). The open-source molecular visualization system PyMOL was used for the 3D visualization of the ligand-protein interaction. The binding mode of a compound with the amino acid residues of the protein was visualized with the help of PyMOL. [31]
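The grid box settings described above are usually passed to Vina through a plain-text configuration file. The following is a minimal sketch of how such a file could be written, not the authors' actual setup: the file names, box center, box size, and exhaustiveness value are hypothetical placeholders, since the paper does not report them. Only the standard Vina configuration keys (receptor, ligand, out, center_x/y/z, size_x/y/z, exhaustiveness) are used.

```python
import tempfile
from pathlib import Path


def write_vina_config(path, receptor, ligand, out, center, size, exhaustiveness=8):
    """Write a plain-text AutoDock Vina configuration file.

    center and size are (x, y, z) tuples given in angstroms.
    Returns the text that was written.
    """
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"out = {out}",
    ]
    for axis, value in zip("xyz", center):
        lines.append(f"center_{axis} = {value:.3f}")
    for axis, value in zip("xyz", size):
        lines.append(f"size_{axis} = {value:.3f}")
    lines.append(f"exhaustiveness = {exhaustiveness}")
    text = "\n".join(lines) + "\n"
    Path(path).write_text(text)
    return text


# Hypothetical placeholder values: the paper does not report its grid box.
config_path = Path(tempfile.gettempdir()) / "vina_config.txt"
config_text = write_vina_config(
    config_path,
    receptor="7BRO_protein.pdbqt",  # assumed file name
    ligand="aloin.pdbqt",           # assumed file name
    out="aloin_docked.pdbqt",       # assumed file name
    center=(0.0, 0.0, 0.0),         # assumed box center
    size=(60.0, 60.0, 60.0),        # assumed box edges covering the whole enzyme
)
print(config_text)
```

With a file like this, docking would typically be started with `vina --config vina_config.txt`. Vina takes the box size directly in angstroms, whereas the grid in AutoDock Tools is defined by a number of points and a spacing; at a spacing of 1.000 Å the two descriptions are approximately interchangeable.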

Docking validation
Two methods were used to validate the docking procedure.
1. Re-docking. Where a natural ligand was attached to the protease enzyme (4RNA/7BRO), re-docking was performed. The inhibitory compounds were removed from the main and papain-like protease enzymes and then re-docked using AutoDock Tools 1.5.6. For re-docking, the co-crystallized complex was opened in a text editor (Notepad) and the heteroatoms were removed from the main protease enzyme. The same procedure was then followed without changing the grid values. The binding affinity obtained by re-docking should show little deviation from the value of the co-crystallized complex.
2. Validation by the Database of Useful Decoys: Enhanced (DUD-E). In this case, an online automated tool (http://decoys.docking.org) was used to generate decoys of the provided ligand. Decoys are compounds that have physical properties similar to those of the reference ligand but may not bind efficiently to the target enzyme. This was done to enhance ligand enrichment, which is crucial for assessing the docking method and excluding false positives. A total of 102 targets were available on this online server, and the properties of the generated ligand decoys were compared with these targets. [32]

Absorption, distribution, metabolism, excretion, toxicity prediction
Pharmacokinetic properties of the inhibitory compounds were analyzed with the online software SwissADME (http://www.swissadme.ch). For this purpose, canonical SMILES acquired from the PubChem database (https://pubchem.ncbi.nlm.nih.gov/) were submitted to SwissADME, and drug-likeness and pharmacokinetic properties were calculated. [33]
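Drug-likeness screening of the kind reported by SwissADME commonly rests on Lipinski's rule of five (molecular weight ≤ 500, logP ≤ 5, at most 5 hydrogen bond donors, at most 10 hydrogen bond acceptors). The sketch below illustrates such a filter under stated assumptions: the descriptor values and compound names are hypothetical and not taken from the paper, and the tolerance of one violation is a common convention rather than a threshold the authors report.

```python
def lipinski_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Count how many of Lipinski's four criteria a compound violates."""
    checks = [
        mol_weight > 500,   # molecular weight above 500 Da
        log_p > 5,          # octanol-water partition coefficient above 5
        h_donors > 5,       # more than 5 hydrogen bond donors
        h_acceptors > 10,   # more than 10 hydrogen bond acceptors
    ]
    return sum(checks)


def passes_lipinski(descriptors, max_violations=1):
    """Return True if the compound has at most max_violations violations.

    max_violations=1 is an assumed convention, not a value from the paper.
    """
    return lipinski_violations(**descriptors) <= max_violations


# Hypothetical descriptor values, used only to illustrate the filter.
candidates = {
    "compound_A": dict(mol_weight=320.3, log_p=2.1, h_donors=3, h_acceptors=6),
    "compound_B": dict(mol_weight=612.7, log_p=5.8, h_donors=7, h_acceptors=12),
}

selected = [name for name, d in candidates.items() if passes_lipinski(d)]
print(selected)  # ['compound_A']
```

In practice the descriptor values would come from the SwissADME output for each canonical SMILES rather than being typed in by hand.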

Toxicity prediction
The toxicity of the inhibitory compounds was analyzed with the online software ToxiM (http://metagenomics.iiserb.ac.in/ToxiM). For this purpose, the PDB or Structure Data File (SDF) of the desired compound, or its PubChem compound ID number, was submitted to ToxiM, and the toxicity results were reported in the form of fingerprints, hybrids, or descriptors.

Quantitative structure-activity relationships (QSAR) analysis
Initially, 20 phytochemical compounds were screened, but only eleven compounds followed Lipinski's rule. These eleven compounds are listed in Table 2. QSAR analysis was performed to evaluate the bioactivity of these compounds. It showed that eight compounds (Aloin, Isovitexin, Myricetin, Carpaine, Kaempferol, Piperitol, Apigenin, and stilbene) and one reference compound have antiviral activity. Three compounds, Cusparine, psoralidin, and lignans, do not show antiviral activity, as their Pa value is less than 0.3. The other eight compounds showed antiviral activity with Pa values between 0.3 and 0.7, so they can be used as inhibitors against the protease enzymes. The eight compounds selected by QSAR analysis are listed in Table 3.

Table 2 List of compounds that passed Lipinski's rule.
No.  Compounds
Table 3 List of compounds selected on the basis of QSAR analysis (determined by the Way2drug/PASS server).
Carpaine  0.4
QSAR = quantitative structure-activity relationships.
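The Pa cut-offs used in the QSAR screening above can be expressed as a small filter. This is a minimal sketch, assuming the PASS output has already been reduced to one antiviral Pa value per compound: the compound names and Pa values below are hypothetical, and the label for values above 0.7 is an assumption, since the paper only describes the bands below 0.3 and between 0.3 and 0.7.

```python
def classify_pa(pa, low=0.3, high=0.7):
    """Map a PASS probability-of-activity (Pa) value to a coarse label."""
    if pa < low:
        return "no predicted antiviral activity"
    if pa <= high:
        return "predicted antiviral activity"
    # Assumed label: the paper reports no compound in this band.
    return "high predicted antiviral activity"


# Hypothetical Pa values, used only to illustrate the thresholds.
predictions = {"compound_A": 0.12, "compound_B": 0.45, "compound_C": 0.81}

# Keep only compounds at or above the lower cut-off of 0.3.
retained = {name: classify_pa(pa) for name, pa in predictions.items() if pa >= 0.3}

for name, label in retained.items():
    print(f"{name}: {label}")
```

Compounds that fall below the lower cut-off are dropped before docking, which mirrors how three of the eleven compounds were excluded.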

Docking
Because protease enzymes play a vital role in the replication of coronavirus, they were used as targets for drug discovery. In this study, different types of phytochemical inhibitory compounds were docked with the main protease enzyme and papain-like protease enzyme of SARS-CoV-2. One approved drug, Remdesivir, which was previously used in the treatment of Ebola virus and is now also used in the treatment of SARS-CoV-2 infection, served as a positive control for screening antiviral phytochemical compounds as potential drugs. BIOVIA Discovery Studio (https://discover.3ds.com/discovery-studio-visualizer-download) and AutoDock tools were used to predict the interaction between all the listed inhibitory compounds and the protease enzymes (3CLpro, PDB ID: 7BRO; PLpro, PDB ID: 4RNA) of SARS-CoV-2. Ten compounds were docked with the protease enzymes (7BRO/4RNA) of SARS-CoV-2, and their molecular interactions were examined with the offline tool PyMOL (https://pymol.org/2/) (Figures 1-6). The binding affinities of the eight selected antiviral medicinal plant compounds and one standard inhibitory compound (Remdesivir) are shown in Table 4.

Docking validation
Re-docking was performed to check the validity of the docking procedure when a natural ligand is attached to the protease enzyme. Aloin and Myricetin bound to the active site of the main protease enzyme with good binding affinity values of −8.7 and −8.4 kcal/mol, respectively, when re-docked. Similarly, Piperitol (https://pubchem.ncbi.nlm.nih.gov/compound/Piperitol) showed a binding affinity of −9.1 kcal/mol when re-docked with the papain-like protease enzyme. Decoys are compounds that have physical properties similar to those of the ligand but different chemical properties. A total of 102 targets are present on the DUD-E server, of which fifteen protease enzymes were selected as targets. Decoys of the Aloin compound were then generated and compared with the decoys and ligands of these targets.
All targets of our ligand compounds are listed in Table 5, together with the gene ID, Protein Data Bank (PDB) entry, matched and experimental decoys, and ligands. In this study, Aloin was considered the lead compound for the main protease enzyme target because its binding energy of −8.7 kcal/mol was more negative than that of the reference compound Remdesivir. Piperitol was considered the lead compound for the papain-like protease enzyme because of its highly negative binding energy of −9.1 kcal/mol. Decoys of the lead compounds were therefore generated with the online tool DUD-E to enhance ligand enrichment. As in the original DUD, decoys were property-matched to ligands using molecular weight, estimated water-octanol partition coefficient (miLogP), rotatable bonds, hydrogen bond acceptors, and hydrogen bond donors, with net charge added as a further property. About 52 decoys of Aloin and 50 decoys of Piperitol were generated, which increases their enrichment capacity and supports their use against the main and papain-like protease enzymes, respectively.

Drug likeness properties analysis of inhibitory compounds
The physicochemical properties, lipophilicity, water solubility, pharmacokinetics, medicinal chemistry, and toxicity of the nine inhibitory compounds identified by docking were analyzed with SwissADME. The water solubility results showed that some of the inhibitory compounds were more soluble and others were moderately soluble. Other important properties, such as molecular weight, topological polar surface area (TPSA), and molar refractivity, were also examined with SwissADME. The absorption, distribution, metabolism, and excretion (ADME) properties of some inhibitory compounds are listed in Table 6.

Toxicity prediction
For an inhibitory compound to be used as a drug, it must have low toxicity and low mutagenicity. The toxicity of all the selected inhibitory compounds was therefore analyzed with ToxiM, a toxicity prediction tool for small molecules. Some of these inhibitory compounds had low toxicity, and others showed moderate or high levels of toxicity. A compound with a classification score greater than 0.8 was considered more toxic. The Aloin compound was moderately toxic compared with Myricetin and Isovitexin. Similarly, the toxicity score of Piperitol was less than 0.8. The toxicity scores of some inhibitory compounds are listed in Table 7.

Discussion
Plants are widely used in the drug discovery process because of their extensive medicinal properties. Moreover, plants are a natural source of phenols, alkaloids, flavonoids, terpenes, steroids, lignans, secoiridoid glycosides, and polyketides, and they have antimicrobial and antiviral activities because of these bioactive compounds. Researchers use these medicinal plants for drug discovery because bioactive compounds obtained from plants can be targeted against a disease-causing protein. [35] The molecular docking technique is used to predict binding sites and to design drugs in a short time. Protein-ligand docking is mostly used to find a stable complex structure with ligand compounds. [36] Proteolytic processing by protease enzymes is essential for the replication of coronavirus; if the protease enzymes do not work properly, replication of the virus is inhibited. [37] Thus, a variety of bioactive compounds obtained from plants were targeted against the protease enzymes of coronavirus in an in silico study. There are two types of protease enzymes: the main protease enzyme and the papain-like protease enzyme. [38] Remdesivir is a broad-spectrum antiviral with activity against a set of viruses such as Nipah virus, Hendra virus, respiratory syncytial virus, MERS-CoV, and SARS-CoV. [9]
In our study, twenty phytochemical compounds were screened on the basis of Lipinski's rule, and only eleven compounds followed it. QSAR analysis was performed to evaluate the bioactivity of these compounds. It showed that eight compounds and one reference compound have antiviral activity. Three compounds, Cusparine, psoralidin, and lignans, do not show antiviral activity, whereas Aloin, Isovitexin, Myricetin, Carpaine, Kaempferol, Piperitol, Apigenin, and stilbene showed antiviral activity with Pa values between 0.3 and 0.7. These compounds can therefore be used as inhibitors against the protease enzymes.
Molecular docking of these compounds was therefore performed with AutoDock.
In docking with 3CLpro of SARS-CoV-2, Aloin (−8.7 kcal/mol), Myricetin (−8.4 kcal/mol), Isovitexin (−8.2 kcal/mol), and Carpaine (−8.0 kcal/mol) had more negative binding affinity values than the standard inhibitory compound Remdesivir (−7.9 kcal/mol). In docking with the papain-like protease enzyme of SARS-CoV-2, Piperitol (−9.1 kcal/mol), Isovitexin (−7.9 kcal/mol), and Cusparine (−7.8 kcal/mol) had more negative binding affinity values than the standard compound Remdesivir (−7.6 kcal/mol). Validation of the docking method with the DUD-E online server indicated that the Aloin compound had 52 decoys, which increased its enrichment capacity and supports its use against the main protease enzyme of SARS-CoV-2. The ADME properties of these compounds were determined with the online software SwissADME, including physicochemical properties, lipophilicity, water solubility, pharmacokinetics, drug-likeness, and medicinal chemistry. Piperitol has a highly negative binding affinity value of −9.1 kcal/mol against the papain-like protease enzyme. This compound had low toxicity and also follows Lipinski's rule. Similarly, the Aloin compound had a highly negative binding affinity value of −8.7 kcal/mol against the main protease enzyme of SARS-CoV-2. Because the main protease enzyme has a greater role in viral replication than the papain-like protease enzyme, the Aloin compound may hypothetically be selected as a drug candidate against the protease enzyme of SARS-CoV-2. This compound can be further tested in biochemical assays, as its medicinal chemistry profile indicates zero Pan-Assay Interference Compounds (PAINS) alerts. It follows Lipinski's rule of 5, and its LD50 value was found to be 500 mg/kg. This compound therefore has a moderate level of toxicity and could be used against COVID-19. [39]
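The comparison against the reference compound can be reproduced directly from the binding affinities quoted above. The sketch below is only an illustration of that comparison, not the authors' workflow: the affinity values are the ones stated in the text, while the dictionary layout and the function name are illustrative choices. A more negative value indicates stronger predicted binding.

```python
REFERENCE = "Remdesivir"

# Binding affinities in kcal/mol, as quoted in the discussion above.
affinities = {
    "3CLpro": {
        "Aloin": -8.7,
        "Myricetin": -8.4,
        "Isovitexin": -8.2,
        "Carpaine": -8.0,
        "Remdesivir": -7.9,
    },
    "PLpro": {
        "Piperitol": -9.1,
        "Isovitexin": -7.9,
        "Cusparine": -7.8,
        "Remdesivir": -7.6,
    },
}


def better_than_reference(scores, reference=REFERENCE):
    """Return (compound, affinity) pairs more negative than the reference,
    sorted from most to least negative."""
    ref_score = scores[reference]
    hits = [
        (name, score)
        for name, score in scores.items()
        if name != reference and score < ref_score
    ]
    return sorted(hits, key=lambda item: item[1])


for target, scores in affinities.items():
    ranked = better_than_reference(scores)
    print(target, "lead:", ranked[0][0], "| all hits:", ranked)
```

For these values the lead compound is Aloin for 3CLpro and Piperitol for PLpro, in line with the selection described in the text.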

Conclusion
In the severe health emergency that the world is facing, a solution has to be established quickly to protect lives around the globe. While some researchers are searching for candidate molecules using synthesis techniques, another way is to search traditional medicine for lead compounds (hits) from plants. Against the papain-like protease enzyme, Piperitol had the most negative binding affinity (−9.1 kcal/mol), followed by Isovitexin (−7.9 kcal/mol) and Cusparine (−7.8 kcal/mol). Because the main protease enzyme has the major role in viral replication, the Aloin compound was considered the best inhibitor against the main protease enzyme, owing to its highly negative binding affinity value compared with the reference compound. The number of decoys generated for the Aloin compound also increased its enrichment capacity and supports its use as the best inhibitor against the main protease enzyme. Moreover, it follows Lipinski's rule and has acceptable pharmacokinetic properties. This compound also had a moderate toxicity value. This makes it a promising compound to follow further in cell and biochemical-based assays to explore its potential use against COVID-19.

Table 5 Overview of representative targets (available on the online server DUD-E).
          GeneID  Description                    Total ligands  Clustered ligands  Experimental decoys  Matched decoys
Protease  ace     Angiotensin-converting enzyme  749            282                55                   16,900
Protease  Ada17   ADAM17                         1341           532                31                   35,900
Protease  Bace1   Beta-secretase1                595            283                41                   18,100
Protease  Casp3   Caspase-3                      470            199                37